Dartmouth
LLM-Driven Corrective Robot Operation Code Generation with Static Text-Based Simulation
Wang, Wenhao, Rong, Yi, Li, Yanyan, Jiao, Long, Yuan, Jiawei
Recent advances in large language models (LLMs) have shown promise in generating robot operation code to enable LLM-driven robots. To enhance the reliability of operation code generated by LLMs, corrective designs that use feedback from observing code execution have been increasingly adopted in existing research. However, code execution in these designs relies on either a physical experiment or a customized simulation environment, which limits their deployment because of the high effort needed to configure the environment and the potentially long execution time. In this paper, we explore the possibility of directly using an LLM to perform static simulation of robot operation code, and then use it to design a new, reliable LLM-driven corrective framework for robot operation code generation. Our framework configures the LLM as a static simulator with enhanced capabilities that reliably simulates robot code execution by interpreting actions, reasoning over state transitions, analyzing execution outcomes, and generating semantic observations that accurately capture trajectory dynamics. To validate the performance of our framework, we performed experiments on various operation tasks for different robots, including UAVs and small ground vehicles. The experimental results demonstrate both the high accuracy of our static text-based simulation and the reliable code generation of our LLM-driven corrective framework, which achieves performance comparable to state-of-the-art research without relying on dynamic code execution in physical experiments or simulators.
- South America > Peru > Loreto Department (0.04)
- North America > United States > Massachusetts > Bristol County > Dartmouth (0.04)
- North America > United States > California (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.46)
Looking Forward: Challenges and Opportunities in Agentic AI Reliability
Xing, Liudong, Lin, Janet
The AI conversation can be traced as far back as Alan Turing's milestone 1950 paper, which considered the fundamental question "Can machines think?" [1]. In 1956, AI got its name and mission as a scientific field at the first AI conference, held at Dartmouth College [2]. Following its foundational period in the 1950s to 1970s, AI has evolved from early rule-based systems (1970s to 1990s), through classical machine learning and deep learning with neural networks (1990s to 2020s), to today's generative and agentic AI systems (since the 2010s). Correspondingly, as a vital requirement of these systems, the concept of reliability and the concerns around it are also evolving, particularly in the interpretation of "required function" (see Table 1 in Chapter 10) in the definition given by standards such as ISO 8402: "The ability of an item to perform a required function, under given environmental and operational conditions and for a stated period of time". While a conventional AI system is concerned with providing stable and accurate classifications, predictions, or optimizations, a reliable generative AI system focuses on producing outputs that are trustworthy, consistent, safe, and contextually appropriate [3]. Building on both, a reliable agentic AI system should additionally perform reasoning, goal alignment, planning, safe adaptation, and interaction in dynamic and collaborative multi-agent contexts. The expansion of reliability concepts has introduced new challenges and research opportunities, as exemplified in Figure 1. In the following sections, we shed light on these challenges and opportunities in building reliable AI systems, particularly agentic AI systems.
- Europe > Sweden > Norrbotten County > Luleå (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > Massachusetts > Bristol County > Dartmouth (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)
Robust Defense Strategies for Multimodal Contrastive Learning: Efficient Fine-tuning Against Backdoor Attacks
Hossain, Md. Iqbal, Sajeeda, Afia, Perla, Neeresh Kumar, Shao, Ming
The advent of multimodal deep learning models, such as CLIP, has unlocked new frontiers in a wide range of applications, from image-text understanding to classification tasks. However, these models are vulnerable to adversarial attacks, particularly backdoor attacks, which can subtly manipulate model behavior. Moreover, existing defense methods typically involve training from scratch or fine-tuning on a large dataset without pinpointing the specific labels that are affected. In this study, we introduce a strategy to enhance the robustness of multimodal contrastive learning models against such attacks. In particular, given a poisoned CLIP model, our approach can efficiently identify the backdoor trigger and pinpoint the victim samples and labels. To that end, an image segmentation "oracle" is introduced as the supervisor for the output of the poisoned CLIP. We develop two algorithms to rectify the poisoned model: (1) differentiating between the knowledge of CLIP and that of the oracle to identify potential triggers; (2) pinpointing affected labels and victim samples, and curating a compact fine-tuning dataset. With this knowledge, we can rectify the poisoned CLIP model to negate backdoor effects. Extensive experiments on visual recognition benchmarks demonstrate that our strategy is effective for CLIP-based backdoor defense.
- North America > United States > Massachusetts > Middlesex County > Lowell (0.14)
- North America > United States > Massachusetts > Bristol County > Dartmouth (0.14)
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
- North America > United States > Washington > King County > Redmond (0.04)
- North America > United States > North Carolina (0.04)
Impute-MACFM: Imputation based on Mask-Aware Flow Matching
Liu, Dengyi, Wang, Honggang, Fang, Hua
Tabular data are central to many applications, especially longitudinal data in healthcare, where missing values are common and undermine model fidelity and reliability. Prior imputation methods either impose restrictive assumptions or struggle with complex cross-feature structure, while recent generative approaches suffer from instability and costly inference. We propose Impute-MACFM, a mask-aware conditional flow matching framework for tabular imputation that addresses three missingness mechanisms: missing completely at random, missing at random, and missing not at random. Its mask-aware objective builds trajectories only on missing entries while constraining the predicted velocity to remain near zero on observed entries, using flexible nonlinear schedules. Impute-MACFM combines: (i) stability penalties on observed positions, (ii) consistency regularization enforcing local invariance, and (iii) time-decayed noise injection for numeric features. Inference uses constraint-preserving ordinary differential equation integration with per-step projection to fix observed values, optionally aggregating multiple trajectories for robustness. Across diverse benchmarks, Impute-MACFM achieves state-of-the-art results while delivering more robust, efficient, and higher-quality imputation than competing approaches, establishing flow matching as a promising direction for tabular missing-data problems, including longitudinal data.
- North America > United States > Massachusetts > Bristol County > Dartmouth (0.14)
- Europe > United Kingdom > North Sea > Southern North Sea (0.04)
- North America > United States > Massachusetts > Worcester County > Worcester (0.04)
- North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
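The per-step projection described in the Impute-MACFM abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a plain Euler integrator, and the names `integrate_with_projection` and `toy_velocity` are hypothetical. `toy_velocity` is a hand-written stand-in for a trained velocity network.

```python
from typing import Callable, List

Vector = List[float]


def integrate_with_projection(
    x_init: Vector,
    observed: Vector,
    mask: List[bool],
    velocity: Callable[[Vector, float], Vector],
    n_steps: int = 50,
) -> Vector:
    """Euler-integrate dx/dt = velocity(x, t) from t = 0 to t = 1.

    mask[i] is True where entry i is observed. After every step the
    observed entries are reset to their known values, so only the
    missing entries actually move along the trajectory.
    """
    # Start from the initial guess, with observed entries already fixed.
    x = [o if m else xi for xi, o, m in zip(x_init, observed, mask)]
    dt = 1.0 / n_steps
    for step in range(n_steps):
        t = step * dt
        v = velocity(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
        # Projection: overwrite observed positions with their true values.
        x = [o if m else xi for xi, o, m in zip(x, observed, mask)]
    return x


def toy_velocity(x: Vector, t: float) -> Vector:
    """Hypothetical velocity field: pulls each entry toward the mean of x."""
    mean = sum(x) / len(x)
    return [mean - xi for xi in x]


observed = [1.0, 0.0, 3.0]   # the value at index 1 is a placeholder
mask = [True, False, True]   # index 1 is missing
x0 = [0.0, 10.0, 0.0]        # arbitrary starting point
imputed = integrate_with_projection(x0, observed, mask, toy_velocity)
print(imputed)
```

Whatever the velocity field does, the observed entries at indices 0 and 2 come back unchanged, and only the missing entry is filled in by the integration.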
Temporally Consistent Unsupervised Segmentation for Mobile Robot Perception
Ellis, Christian, Wigness, Maggie, Lennon, Craig, Fiondella, Lance
Rapid progress in terrain-aware autonomous ground navigation has been driven by advances in supervised semantic segmentation. However, these methods rely on costly data collection and labor-intensive ground truth labeling to train deep models. Furthermore, autonomous systems are increasingly deployed in unrehearsed, unstructured environments where no labeled data exists and semantic categories may be ambiguous or domain-specific. Recent zero-shot approaches to unsupervised segmentation have shown promise in such settings but typically operate on individual frames, lacking temporal consistency, a critical property for robust perception in unstructured environments. To address this gap, we introduce Frontier-Seg, a method for temporally consistent unsupervised segmentation of terrain from mobile robot video streams. Frontier-Seg clusters superpixel-level features extracted from foundation model backbones (specifically DINOv2) and enforces temporal consistency across frames to identify persistent terrain boundaries, or frontiers, without human supervision. We evaluate Frontier-Seg on a diverse set of benchmark datasets, including RUGD and RELLIS-3D, demonstrating its ability to perform unsupervised segmentation across unstructured off-road environments.
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > Massachusetts > Bristol County > Dartmouth (0.04)
- North America > United States > Maryland > Prince George's County > Adelphi (0.04)
- Energy > Power Industry > Utilities > Nuclear (0.67)
- Transportation > Ground > Road (0.48)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
- Information Technology > Artificial Intelligence > Robots > Locomotion (0.60)
S$^2$GPT-PINNs: Sparse and Small models for PDEs
Ji, Yajie, Chen, Yanlai, Koohy, Shawn
We propose S$^2$GPT-PINN, a sparse and small model for solving parametric partial differential equations (PDEs). Similar to Small Language Models (SLMs), S$^2$GPT-PINN is tailored to domain-specific (families of) PDEs and characterized by its compact architecture and minimal computational requirements. Leveraging a small amount of extremely high-quality data via a mathematically rigorous greedy algorithm that is enabled by the large full-order models, S$^2$GPT-PINN relies on orders of magnitude fewer parameters than PINNs and achieves extremely high efficiency via two levels of customization. The first is knowledge distillation via task-specific activation functions that are transferred from pre-trained PINNs. The second is a judicious down-sampling when calculating the physics-informed loss of the network, compressing the number of data sites by orders of magnitude to the size of the small model.
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
- North America > United States > Massachusetts > Bristol County > Dartmouth (0.14)
- Asia > China > Shanghai > Shanghai (0.04)
- Europe > Portugal > Braga > Braga (0.04)
- Education (0.46)
- Government (0.46)
A Computational Approach to Improving Fairness in K-means Clustering
Zhou, Guancheng, Xu, Haiping, Xu, Hongkang, Li, Chenyu, Yan, Donghui
Clustering is an important problem in data mining. It aims to split the data into groups such that data points in the same group are similar while points in different groups are dissimilar under a given similarity metric. Clustering has been successfully applied in many practical settings, such as data grouping in exploratory data analysis, search result categorization, and market segmentation. Clustering results are often used for further analysis or interpretation. However, directly applying results obtained from the usual clustering algorithms may raise fairness issues: some clusters may favor data points from one of the subpopulations, i.e., contain disproportionately more of its points. Figure 1: Illustration of the fairness issue in clustering. Points of different colors indicate different traits on a sensitive variable, e.g., gender, where blue indicates male and red female. Cluster 1 is dominated by females while Cluster 2 is dominated by males. Points with an arrow indicate that we might switch their cluster membership assignment to make the clusters less dominated by one subpopulation.
- North America > United States > Massachusetts > Bristol County > Dartmouth (0.14)
- Asia > Middle East > Jordan (0.05)
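The kind of dominance described in the clustering abstract above can be quantified with a short sketch. This is an illustration only, not the paper's method: the function names `cluster_group_shares` and `max_imbalance` are hypothetical, and the imbalance measure (largest gap between a group's share in a cluster and its share overall) is an assumption chosen for simplicity.

```python
from collections import Counter, defaultdict
from typing import Dict, Hashable, Sequence


def cluster_group_shares(
    labels: Sequence[Hashable], groups: Sequence[Hashable]
) -> Dict[Hashable, Dict[Hashable, float]]:
    """For each cluster, return the fraction of its points in each group."""
    counts: Dict[Hashable, Counter] = defaultdict(Counter)
    for label, group in zip(labels, groups):
        counts[label][group] += 1
    return {
        label: {g: n / sum(c.values()) for g, n in c.items()}
        for label, c in counts.items()
    }


def max_imbalance(labels: Sequence[Hashable], groups: Sequence[Hashable]) -> float:
    """Largest absolute gap between a group's in-cluster and overall share."""
    overall = Counter(groups)
    total = len(groups)
    shares = cluster_group_shares(labels, groups)
    return max(
        abs(cluster.get(g, 0.0) - overall[g] / total)
        for cluster in shares.values()
        for g in overall
    )


# Toy data: cluster 0 is mostly "F", cluster 1 is mostly "M".
labels = [0, 0, 0, 0, 1, 1, 1, 1]
groups = ["F", "F", "F", "M", "M", "M", "M", "F"]
shares = cluster_group_shares(labels, groups)
imbalance = max_imbalance(labels, groups)
print(shares, imbalance)
```

A value of 0 means every cluster mirrors the overall group proportions. Switching the membership of a boundary point, as the arrows in the figure suggest, changes these shares and can lower the imbalance.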
Adaptive Pruning of Deep Neural Networks for Resource-Aware Embedded Intrusion Detection on the Edge
Broggi, Alexandre, Bastian, Nathaniel, Fiondella, Lance, Kul, Gokhan
Artificial neural network pruning is a method for reducing the size of an artificial neural network while attempting to preserve its predictive capability. This is done to make the model smaller or faster at inference time. In this work, we analyze the ability of a selection of artificial neural network pruning methods to generalize to a new cybersecurity dataset using a simpler network type than the one they were designed for. We analyze each method at a variety of pruning degrees to best understand how each algorithm responds to the new environment. This has allowed us to determine the best-suited pruning method for the task among those we examined. Unexpectedly, we have found that many of them do not generalize well to the problem, leaving only a few algorithms that work to an acceptable degree.
- Information Technology > Security & Privacy (1.00)
- Government > Military (1.00)
- Government > Regional Government > North America Government > United States Government (0.68)
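As a rough illustration of what a "pruning degree" means in the pruning abstract above, the sketch below applies unstructured magnitude pruning to a flat list of weights. Magnitude pruning is used here only as a generic, well-known example. The abstract does not say which methods were evaluated, and `magnitude_prune` is a hypothetical helper name.

```python
from typing import List


def magnitude_prune(weights: List[float], fraction: float) -> List[float]:
    """Zero out the given fraction of weights with the smallest magnitude.

    `fraction` plays the role of the pruning degree: 0.0 keeps every
    weight and 1.0 removes all of them.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    n_prune = int(len(weights) * fraction)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [0.0 if i in pruned else w for i, w in enumerate(weights)]


weights = [0.5, -0.1, 0.05, -2.0, 0.3, 0.01]
half_pruned = magnitude_prune(weights, 0.5)
print(half_pruned)  # [0.5, 0.0, 0.0, -2.0, 0.3, 0.0]
```

Sweeping `fraction` over several values and re-measuring accuracy at each setting is one simple way to compare how methods respond to increasing sparsity.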